library(tidyverse) #for data wrangling etc
library(rstanarm) #for fitting models in STAN
library(cmdstanr) #for cmdstan
library(brms) #for fitting models in STAN
library(standist) #for exploring distributions
library(coda) #for diagnostics
library(bayesplot) #for diagnostics
library(ggmcmc) #for MCMC diagnostics
library(DHARMa) #for residual diagnostics
library(rstan) #for interfacing with STAN
library(emmeans) #for marginal means etc
library(broom) #for tidying outputs
library(tidybayes) #for more tidying outputs
library(HDInterval) #for HPD intervals
library(ggeffects) #for partial plots
library(broom.mixed)#for summarising models
library(posterior) #for posterior draws
library(ggeffects) #for partial effects plots
library(patchwork) #for multi-panel figures
library(bayestestR) #for ROPE
library(see) #for some plots
library(easystats) #for the easystats ecosystem
theme_set(theme_grey()) #put the default ggplot theme back
source('helperFunctions.R')

Bayesian GLM Part 3
1 Preparations
Load the necessary libraries
2 Scenario
Here is a modified example from Peake and Quinn (1993), who investigated the relationship between the number of invertebrate individuals living amongst clumps of mussels on a rocky intertidal shore and the area of those mussel clumps.
| AREA | INDIV |
|---|---|
| 516.00 | 18 |
| 469.06 | 60 |
| 462.25 | 57 |
| 938.60 | 100 |
| 1357.15 | 48 |
| ... | ... |
| AREA | Area of mussel clump (mm²) - Predictor variable |
| INDIV | Number of individuals found within clump - Response variable |
The aim of the analysis is to investigate the relationship between mussel clump area and the number of non-mussel invertebrate individuals supported in the mussel clump.
3 Read in the data
Rows: 25 Columns: 2
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (2): AREA, INDIV
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
peake (25 rows and 2 variables, 2 shown)
ID | Name | Type | Missings | Values | N
---+-------+---------+----------+-----------------+---
1 | AREA | numeric | 0 (0.0%) | [462.25, 27144] | 25
---+-------+---------+----------+-----------------+---
2 | INDIV | numeric | 0 (0.0%) | [18, 1402] | 25
------------------------------------------------------
4 Exploratory data analysis
When exploring these data as part of a frequentist analysis, exploratory data analysis revealed that both the response (counts of individuals) and the predictor (mussel clump area) were skewed and the relationship between the raw counts and mussel clump area was not linear. Furthermore, there was strong evidence of a relationship between the mean and the variance. Normalising both the response and the predictor addressed these issues. However, rather than log transform the response, it was considered more appropriate to model against a distribution that uses a logarithmic link function.
The individual observations here (\(y_i\)) are the observed numbers of non-mussel individuals found in mussel clump \(i\). As counts, these might be expected to follow a Poisson (or perhaps negative binomial) distribution. In the case of a negative binomial, the observed count for any given mussel clump area is expected to be drawn from a negative binomial distribution with a mean of \(\lambda_i\). All the negative binomial distributions are expected to share the same degree of dispersion (\(\theta\)) - that is, the degree to which the inhabitants of mussel clumps aggregate together (or any other reason for overdispersion) is independent of mussel clump area and can be estimated as a constant across all populations.
The natural log of the expected values (\(\lambda_i\)) is modelled against a linear predictor that includes an intercept (\(\beta_0\)) and a slope (\(\beta_1\)) associated with the natural log of mussel area.
The priors on \(\beta_0\) and \(\beta_1\) should be on the natural log scale (since this will be the scale of the parameters). As starting points, we will consider the following priors:
- \(\beta_0\): Normal prior centered at 0 with a variance of 5
- \(\beta_1\): Normal prior centered at 0 with a variance of 2
- \(\theta\): Exponential prior with rate 1
Model formula:
\[
\begin{align}
y_i &\sim \mathcal{NB}(\lambda_i, \theta)\\
\ln(\lambda_i) &= \beta_0 + \beta_1 \ln(x_i)\\
\beta_0 &\sim \mathcal{N}(6, 1.5)\\
\beta_1 &\sim \mathcal{N}(0, 1)\\
\theta &\sim \mathcal{Exp}(1) \quad \text{or} \quad \theta \sim \Gamma(2, 1)
\end{align}
\]
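The prior locations and scales above can be loosely motivated from the observed data. A minimal sketch (the exact rounding of these quantities is a judgement call):

```r
## Rough empirical motivation for the prior location/scale choices (approximate)
mean(log(peake$INDIV))                       # prior location for beta_0
sd(log(peake$INDIV))                         # guides the prior scale for beta_0
sd(log(peake$INDIV)) / sd(log(peake$AREA))   # guides the prior scale for beta_1
```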
5 Fit the model
Call:
glm(formula = INDIV ~ log(AREA), family = poisson(), data = peake)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.013528 0.091147 0.148 0.882
log(AREA) 0.691628 0.009866 70.100 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 7494.5 on 24 degrees of freedom
Residual deviance: 1547.8 on 23 degrees of freedom
AIC: 1738.9
Number of Fisher Scoring iterations: 4
Call:
MASS::glm.nb(formula = INDIV ~ log(AREA), data = peake, init.theta = 7.369263003,
link = log)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.16452 0.53387 -2.181 0.0292 *
log(AREA) 0.82469 0.06295 13.102 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for Negative Binomial(7.3693) family taken to be 1)
Null deviance: 161.076 on 24 degrees of freedom
Residual deviance: 25.941 on 23 degrees of freedom
AIC: 312.16
Number of Fisher Scoring iterations: 1
Theta: 7.37
Std. Err.: 2.16
2 x log-likelihood: -306.16
In rstanarm, the default priors are designed to be weakly informative. They are chosen to provide moderate regularisation (to help prevent overfitting) and help stabilise the computations.
Priors for model 'peake.rstanarm'
------
Intercept (after predictors centered)
~ normal(location = 0, scale = 2.5)
Coefficients
Specified prior:
~ normal(location = 0, scale = 2.5)
Adjusted prior:
~ normal(location = 0, scale = 2)
------
See help('prior_summary.stanreg') for more details
This tells us:
- for the intercept, it is using a normal prior with a mean of 0 and a standard deviation of 2.5. These are the defaults for all non-Gaussian intercepts.
- for the coefficients (in this case, just the slope), the default prior is a normal prior centred on 0 with a standard deviation of 2.5. For Poisson models, this is then adjusted for the scale of the data by dividing the 2.5 by the standard deviation of the (in this case log-transformed) predictor (then rounded).
- there is no auxiliary parameter and thus there is no auxiliary prior
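The scale adjustment for the slope can be reproduced manually. A sketch, assuming the fitted object is named peake.rstanarm (as in the prior summary above):

```r
## rstanarm's autoscaling: default scale divided by the sd of the (log) predictor
prior_summary(peake.rstanarm)   # reports the specified and adjusted priors
2.5 / sd(log(peake$AREA))       # should match the adjusted prior scale (~2)
```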
One way to assess the priors is to have the MCMC sampler sample purely from the prior predictive distribution without conditioning on the observed data. Doing so provides a glimpse at the range of predictions possible under the priors. On the one hand, wide-ranging predictions would ensure that the priors are unlikely to influence the actual predictions once they are conditioned on the data. On the other hand, if they are too wide, the sampler is being permitted to traverse into regions of parameter space that are not logically possible in the context of the actual underlying ecological system. Not only could this mean that illogical parameter estimates are possible; when the sampler traverses regions of parameter space that are not supported by the actual data, it can become unstable and have difficulty converging.
We can draw from the prior predictive distribution, instead of conditioning on the response, by updating the model with prior_PD = TRUE. After refitting the model in this way, we can plot the predictions to gain insights into the range of predictions supported by the priors alone.
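A minimal sketch of such an update (the object name peake.rstanarm1 is illustrative; peake.rstanarm is the fit named in the prior summary above):

```r
## Refit, sampling only from the priors (the data are ignored in the likelihood)
peake.rstanarm1 <- update(peake.rstanarm, prior_PD = TRUE)
```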
Conclusions:
- we see that although the range of predictions is very wide and the slope could range from strongly negative to strongly positive, the choice of an intercept prior with a mean of 0 results in most of the prior-only posterior draws being dramatically lower than the observed values - indeed, some observed values are above the range of the prior-only predictions.
The following link provides some guidance about defining priors. [https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations]
When defining our own priors, we typically do not want them to be scaled.
If we wanted to define our own priors that were less vague, yet still not likely to bias the outcomes, we could try the following priors (mainly plucked out of thin air):
- \(\beta_0\): normal centred at 6 with a standard deviation of 2.8
  - mean of 6: because mean(log(peake$INDIV)) is approximately 6
  - sd of 2.8: because 2.5 * sd(log(peake$INDIV)) is approximately 2.8
- \(\beta_1\): normal centred at 0 with a standard deviation of 2.3
  - sd of 2.3: because 2.5 * sd(log(peake$INDIV))/sd(log(peake$AREA)) is approximately 2.3
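These priors could be supplied explicitly when fitting with stan_glm(). A sketch, with autoscaling suppressed so the stated scales are used as-is (the object name peake.rstanarm2 is illustrative):

```r
## Custom, unscaled priors; prior_PD = TRUE samples from the priors only
peake.rstanarm2 <- stan_glm(INDIV ~ log(AREA), data = peake,
                            family = poisson(link = 'log'),
                            prior_intercept = normal(6, 2.8, autoscale = FALSE),
                            prior = normal(0, 2.3, autoscale = FALSE),
                            prior_PD = TRUE,
                            iter = 5000, warmup = 1000,
                            chains = 3, cores = 3, refresh = 0)
```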
I will also overlay the raw data for comparison.
Now let's refit, conditioning on the data.
Drawing from prior...
Conclusions:
- in each case, the prior is substantially wider than the posterior, suggesting that the posterior is not biased towards the prior.
In brms, the default priors are designed to be weakly informative. They are chosen to provide moderate regularisation (to help prevent overfitting) and help stabilise the computations.
Unlike rstanarm, brms models must be compiled before they start sampling. For most models, the compilation of the stan code takes around 45 seconds.
peake.brm <- brm(bf(INDIV ~ log(AREA), family=poisson()),
data=peake,
iter = 5000, warmup = 1000,
chains = 3, cores = 3,
thin = 5, refresh = 0,
backend = "cmdstan")

Start sampling
Running MCMC with 3 parallel chains...
Chain 1 finished in 0.1 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.1 seconds.
All 3 chains finished successfully.
Mean chain execution time: 0.1 seconds.
Total execution time: 0.3 seconds.
prior class coef group resp dpar nlpar lb ub source
(flat) b default
(flat) b logAREA (vectorized)
student_t(3, 5.8, 2.5) Intercept default
This tells us:
- for the intercept, it is using a Student t prior (similar to a normal, but with heavier tails) with 3 degrees of freedom, a location of 5.8 and a scale of 2.5. The location of this prior is based on the median of the log-transformed response and the scale of 2.5 is the default minimum.
- for the beta coefficients (in this case, just the slope), the default prior is an improper flat prior. A flat prior essentially means that any value between negative infinity and positive infinity is considered equally likely. Whilst this might seem reckless, in practice it seems to work reasonably well for non-intercept beta parameters.
- there is no sigma parameter in a Poisson model and therefore there are no additional priors.
One way to assess the priors is to have the MCMC sampler sample purely from the prior predictive distribution without conditioning on the observed data. Doing so provides a glimpse at the range of predictions possible under the priors. On the one hand, wide-ranging predictions would ensure that the priors are unlikely to influence the actual predictions once they are conditioned on the data. On the other hand, if they are too wide, the sampler is being permitted to traverse into regions of parameter space that are not logically possible in the context of the actual underlying ecological system. Not only could this mean that illogical parameter estimates are possible; when the sampler traverses regions of parameter space that are not supported by the actual data, it can become unstable and have difficulty converging.
In brms, we can inform the sampler to draw from the prior predictive distribution instead of conditioning on the response, by running the model with the sample_prior='only' argument. Unfortunately, this cannot be applied when there are flat priors (since the posteriors will necessarily extend to negative and positive infinity). Therefore, in order to use this useful routine, we need to make sure that we have defined a proper prior for all parameters.
We will adopt a similar approach to that of rstanarm - that is 2.5 * sd(log(peake$INDIV))/sd(log(peake$AREA)) (approx. 2.3).
peake.brm1 <- brm(bf(INDIV ~ log(AREA), family=poisson()),
data=peake,
prior=c(
prior(normal(0, 2.3), class='b')),
sample_prior = 'only',
iter = 5000, warmup = 1000,
chains = 3, cores = 3,
thin = 5, refresh = 0,
backend = "cmdstan")

Start sampling
Running MCMC with 3 parallel chains...
Chain 1 finished in 0.0 seconds.
Chain 2 finished in 0.0 seconds.
Chain 3 finished in 0.0 seconds.
All 3 chains finished successfully.
Mean chain execution time: 0.0 seconds.
Total execution time: 0.2 seconds.
Conclusions:
- we see that the range of predictions is fairly wide and the slope could range from strongly negative to strongly positive.
The following link provides some guidance about defining priors. [https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations]
When defining our own priors, we typically do not want them to be scaled.
If we wanted to define our own priors that were less vague, yet still not likely to bias the outcomes, we could try the following priors (mainly plucked out of thin air):
- \(\beta_0\): normal centred at 6 with a standard deviation of 1.5
  - mean of 6: because mean(log(peake$INDIV)) is approximately 6
  - sd of 1.5: because sd(log(peake$INDIV)) is approximately 1.5
- \(\beta_1\): normal centred at 0 with a standard deviation of 1
  - sd of 1: because sd(log(peake$INDIV))/sd(log(peake$AREA)) is approximately 1
I will also overlay the raw data for comparison.
peake.brm2 <- brm(bf(INDIV ~ log(AREA), family=poisson()),
data=peake,
prior=c(
prior(normal(6, 1.5), class='Intercept'),
prior(normal(0, 1), class='b')
),
sample_prior = 'only',
iter = 5000, warmup = 1000,
chains = 3, cores = 3,
thin = 5, refresh = 0,
backend = "cmdstan")

Start sampling
Running MCMC with 3 parallel chains...
Chain 1 finished in 0.0 seconds.
Chain 2 finished in 0.0 seconds.
Chain 3 finished in 0.0 seconds.
All 3 chains finished successfully.
Mean chain execution time: 0.0 seconds.
Total execution time: 0.2 seconds.
The desired updates require recompiling the model
Start sampling
Running MCMC with 3 sequential chains...
Chain 1 finished in 0.1 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.1 seconds.
All 3 chains finished successfully.
Mean chain execution time: 0.1 seconds.
Total execution time: 0.4 seconds.
$N
[1] 25
$Y
[1] 18 60 57 100 48 118 148 214 225 283 380 278 338 274 569
[16] 509 682 600 978 363 1402 675 1159 1062 632
$K
[1] 2
$X
Intercept logAREA
1 1 6.246107
2 1 6.150731
3 1 6.136106
4 1 6.844389
5 1 7.213142
6 1 7.480800
7 1 7.430120
8 1 7.487896
9 1 8.035949
10 1 8.289067
11 1 8.394989
12 1 8.401037
13 1 8.513765
14 1 8.400853
15 1 8.610818
16 1 8.919481
17 1 8.873303
18 1 9.121503
19 1 9.223560
20 1 9.136445
21 1 9.527275
22 1 9.918414
23 1 10.115073
24 1 10.208912
25 1 10.170373
attr(,"assign")
[1] 0 1
$prior_only
[1] 0
attr(,"class")
[1] "standata" "list"
// generated with brms 2.19.0
functions {
}
data {
int<lower=1> N; // total number of observations
array[N] int Y; // response variable
int<lower=1> K; // number of population-level effects
matrix[N, K] X; // population-level design matrix
int prior_only; // should the likelihood be ignored?
}
transformed data {
int Kc = K - 1;
matrix[N, Kc] Xc; // centered version of X without an intercept
vector[Kc] means_X; // column means of X before centering
for (i in 2 : K) {
means_X[i - 1] = mean(X[ : , i]);
Xc[ : , i - 1] = X[ : , i] - means_X[i - 1];
}
}
parameters {
vector[Kc] b; // population-level effects
real Intercept; // temporary intercept for centered predictors
}
transformed parameters {
real lprior = 0; // prior contributions to the log posterior
lprior += normal_lpdf(b | 0, 1);
lprior += normal_lpdf(Intercept | 6, 1.5);
}
model {
// likelihood including constants
if (!prior_only) {
target += poisson_log_glm_lpmf(Y | Xc, Intercept, b);
}
// priors including constants
target += lprior;
}
generated quantities {
// actual population-level intercept
real b_Intercept = Intercept - dot_product(means_X, b);
// additionally sample draws from priors
real prior_b = normal_rng(0, 1);
real prior_Intercept = normal_rng(6, 1.5);
}
6 MCMC sampling diagnostics
The bayesplot package offers a range of MCMC diagnostics as well as posterior predictive checks (PPC), all of which have a convenient plot() interface. Let's start with the MCMC diagnostics.
See list of available diagnostics by name
bayesplot MCMC module:
mcmc_acf
mcmc_acf_bar
mcmc_areas
mcmc_areas_ridges
mcmc_combo
mcmc_dens
mcmc_dens_chains
mcmc_dens_overlay
mcmc_hex
mcmc_hist
mcmc_hist_by_chain
mcmc_intervals
mcmc_neff
mcmc_neff_hist
mcmc_nuts_acceptance
mcmc_nuts_divergence
mcmc_nuts_energy
mcmc_nuts_stepsize
mcmc_nuts_treedepth
mcmc_pairs
mcmc_parcoord
mcmc_rank_ecdf
mcmc_rank_hist
mcmc_rank_overlay
mcmc_recover_hist
mcmc_recover_intervals
mcmc_recover_scatter
mcmc_rhat
mcmc_rhat_hist
mcmc_scatter
mcmc_trace
mcmc_trace_highlight
mcmc_violin
Of these, we will focus on:
- mcmc_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different shade of blue, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
The chains appear well mixed and very similar
- acf (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
There is no evidence of autocorrelation in the MCMC samples
- Rhat: Rhat is a measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
All Rhat values are below 1.05, suggesting the chains have converged.
- neff (number of effective samples): the ratio of the effective sample size (the number of effectively independent draws, after accounting for autocorrelation within the chains) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are highly autocorrelated and thus carry substantially less information than the raw number of samples implies.
If the ratios are low, tightening the priors may help.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Ratios all very high.
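As a sketch, these diagnostics can be requested by name through the stanreg plot() method (assuming, as rstanarm documents, that the bayesplot mcmc_ prefix may be omitted from plotfun names):

```r
## MCMC diagnostics via the plot() interface of the rstanarm fit
plot(peake.rstanarm, plotfun = 'trace')      # traceplots
plot(peake.rstanarm, plotfun = 'acf_bar')    # autocorrelation
plot(peake.rstanarm, plotfun = 'rhat_hist')  # convergence (Rhat)
plot(peake.rstanarm, plotfun = 'neff_hist')  # effective sample size ratios
```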
The rstan package also offers a range of MCMC diagnostics.
Of these, we will focus on:
- stan_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
The chains appear well mixed and very similar
- stan_ac (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
There is no evidence of autocorrelation in the MCMC samples
- stan_rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
All Rhat values are below 1.05, suggesting the chains have converged.
- stan_ess (number of effective samples): the ratio of the effective sample size (the number of effectively independent draws, after accounting for autocorrelation within the chains) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are highly autocorrelated and thus carry substantially less information than the raw number of samples implies.
If the ratios are low, tightening the priors may help.
Ratios all very high.
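A sketch of the equivalent rstan calls, assuming the underlying stanfit object can be extracted from the rstanarm fit via $stanfit:

```r
## rstan diagnostic plots applied to the underlying stanfit object
stan_trace(peake.rstanarm$stanfit)  # traceplots
stan_ac(peake.rstanarm$stanfit)     # autocorrelation
stan_rhat(peake.rstanarm$stanfit)   # Rhat distribution
stan_ess(peake.rstanarm$stanfit)    # effective sample size ratios
```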
The ggmcmc package also has a set of MCMC diagnostic functions.
Of these, we will focus on:
- ggs_traceplot: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
The chains appear well mixed and very similar
- ggs_autocorrelation (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
There is no evidence of autocorrelation in the MCMC samples
- ggs_Rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
All Rhat values are below 1.05, suggesting the chains have converged.
- effective sample size: the ratio of the effective sample size (the number of effectively independent draws, after accounting for autocorrelation within the chains) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are highly autocorrelated and thus carry substantially less information than the raw number of samples implies.
If the ratios are low, tightening the priors may help.
Ratios all very high.
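A sketch of the ggmcmc workflow, assuming ggs() accepts the underlying stanfit object stored in the rstanarm fit:

```r
## Convert the draws to a ggs tibble, then apply the ggmcmc diagnostics
peake.ggs <- ggs(peake.rstanarm$stanfit)
ggs_traceplot(peake.ggs)        # traceplots
ggs_autocorrelation(peake.ggs)  # autocorrelation
ggs_Rhat(peake.ggs)             # Rhat
```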
The bayesplot package offers a range of MCMC diagnostics as well as posterior predictive checks (PPC), all of which have a convenient plot() interface. Let's start with the MCMC diagnostics.
Of these, we will focus on:
- trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different shade of blue, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
The chains appear well mixed and very similar
- acf_bar (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
There is no evidence of autocorrelation in the MCMC samples
- rhat_hist: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
All Rhat values are below 1.05, suggesting the chains have converged.
- neff_hist (number of effective samples): the ratio of the effective sample size (the number of effectively independent draws, after accounting for autocorrelation within the chains) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are highly autocorrelated and thus carry substantially less information than the raw number of samples implies.
If the ratios are low, tightening the priors may help.
Ratios all very high.
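For the brms fit, the same bayesplot diagnostics are available through the mcmc_plot() wrapper. A sketch (peake.brm2 was fit with our user-defined priors; substitute whichever fitted object was conditioned on the data):

```r
## bayesplot diagnostics via brms' mcmc_plot() wrapper
mcmc_plot(peake.brm2, type = 'trace')      # traceplots
mcmc_plot(peake.brm2, type = 'acf_bar')    # autocorrelation
mcmc_plot(peake.brm2, type = 'rhat_hist')  # Rhat
mcmc_plot(peake.brm2, type = 'neff_hist')  # effective sample size ratios
```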
The rstan package also offers a range of MCMC diagnostics.
Of these, we will focus on:
- stan_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
The chains appear well mixed and very similar
- stan_ac (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
There is no evidence of autocorrelation in the MCMC samples
- stan_rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
All Rhat values are below 1.05, suggesting the chains have converged.
- stan_ess (number of effective samples): the ratio of the effective sample size (the number of effectively independent draws, after accounting for autocorrelation within the chains) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are highly autocorrelated and thus carry substantially less information than the raw number of samples implies.
If the ratios are low, tightening the priors may help.
Ratios all very high.
The ggmcmc package also has a set of MCMC diagnostic functions.
Of these, we will focus on:
- ggs_traceplot: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e., should not be displaced above or below any other trace for that parameter).
The chains appear well mixed and very similar
- ggs_autocorrelation (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
There is no evidence of autocorrelation in the MCMC samples
- ggs_Rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
All Rhat values are below 1.05, suggesting the chains have converged.
- effective sample size: the ratio of the effective sample size (the number of effectively independent draws, after accounting for autocorrelation within the chains) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the draws are highly autocorrelated and thus carry substantially less information than the raw number of samples implies.
If the ratios are low, tightening the priors may help.
Ratios all very high.
7 Model validation
Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they provide a comparison between predictions drawn from the model and the observed data used to train the model.
See list of available diagnostics by name
bayesplot PPC module:
ppc_bars
ppc_bars_grouped
ppc_boxplot
ppc_dens
ppc_dens_overlay
ppc_dens_overlay_grouped
ppc_ecdf_overlay
ppc_ecdf_overlay_grouped
ppc_error_binned
ppc_error_hist
ppc_error_hist_grouped
ppc_error_scatter
ppc_error_scatter_avg
ppc_error_scatter_avg_grouped
ppc_error_scatter_avg_vs_x
ppc_freqpoly
ppc_freqpoly_grouped
ppc_hist
ppc_intervals
ppc_intervals_grouped
ppc_km_overlay
ppc_km_overlay_grouped
ppc_loo_intervals
ppc_loo_pit
ppc_loo_pit_overlay
ppc_loo_pit_qq
ppc_loo_ribbon
ppc_pit_ecdf
ppc_pit_ecdf_grouped
ppc_ribbon
ppc_ribbon_grouped
ppc_rootogram
ppc_scatter
ppc_scatter_avg
ppc_scatter_avg_grouped
ppc_stat
ppc_stat_2d
ppc_stat_freqpoly
ppc_stat_freqpoly_grouped
ppc_stat_grouped
ppc_violin_grouped
- dens_overlay: plots the density distribution of the observed data (black line) overlaid on top of 50 density distributions generated from draws from the model (light blue). Ideally, the 50 realisations should be roughly consistent with the observed data.
The model draws appear to deviate from the observed data.
- error_scatter_avg: this plots the observed values against the average residuals. Similar to a residual plot, we do not want to see any patterns in this plot. There is some pattern remaining in these residuals.
The predictive error seems to be related to the predictor - the model performs poorest at higher mussel clump areas.
- error_scatter_avg_vs_x: this is similar to a regular residual plot and should be interpreted as such.
- intervals: plots the observed data overlaid on top of posterior predictions associated with each level of the predictor. Ideally, the observed data should all fall within the predictive intervals.
The modelled predictions seem to underestimate the uncertainty with increasing mussel clump area.
- ribbon: this is just an alternative way of expressing the above plot.
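A sketch of how these checks can be generated via rstanarm's pp_check() method (using peake.rstanarm3, the object name that appears in the DHARMa code below; passing x through to the intervals plot is an assumption):

```r
## Posterior predictive checks via pp_check()
pp_check(peake.rstanarm3, plotfun = 'dens_overlay', nreps = 50)  # density overlay
pp_check(peake.rstanarm3, plotfun = 'error_scatter_avg')         # average residuals
pp_check(peake.rstanarm3, plotfun = 'intervals', x = log(peake$AREA))  # predictive intervals
```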
The shinystan package allows the full suite of MCMC diagnostics and posterior predictive checks to be accessed via a web interface.
DHARMa residuals provide very useful diagnostics. Unfortunately, we cannot directly use the simulateResiduals() function to generate the simulated residuals. However, if we are willing to calculate some of the components ourselves, we can still obtain the simulated residuals from the fitted stan model.
We need to supply:
- simulated (predicted) responses associated with each observation.
- observed values
- fitted (predicted) responses (averaged) associated with each observation
preds <- posterior_predict(peake.rstanarm3, ndraws = 250, summary = FALSE)  # simulated responses (draws x observations)
peake.resids <- createDHARMa(simulatedResponse = t(preds),                  # transpose: one column per draw
                             observedResponse = peake$INDIV,                # observed counts
                             fittedPredictedResponse = apply(preds, 2, median),  # median prediction per observation
                             integerResponse = TRUE)                        # counts are integers
plot(peake.resids)
DHARMa nonparametric dispersion test via sd of residuals fitted vs.
simulated
data: simulationOutput
dispersion = 95.743, p-value < 2.2e-16
alternative hypothesis: two.sided
Conclusions:
- the simulated residuals suggest a general lack of fit due to overdispersion and outliers
- perhaps we should explore a negative binomial model
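The mechanics behind DHARMa's simulated (scaled) residuals are simple: each observation's residual is its empirical cumulative probability within its own posterior predictive distribution, with randomised tie-breaking for integer responses. A minimal sketch of the idea, written in Python for concreteness (the `simulated_residuals` helper is hypothetical, not part of DHARMa):

```python
import numpy as np

def simulated_residuals(simulated, observed, rng=None):
    """DHARMa-style scaled (quantile) residuals.

    simulated: (n_obs, n_sims) matrix of posterior predictive draws
    observed:  (n_obs,) vector of observed responses

    For each observation, the residual is the empirical cumulative
    probability of the observed value within its simulated distribution.
    For integer responses, ties are broken by randomisation so that a
    correctly specified model yields residuals uniform on [0, 1].
    """
    rng = np.random.default_rng(rng)
    sims = np.asarray(simulated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    below = (sims < obs[:, None]).mean(axis=1)   # P(sim < obs)
    equal = (sims == obs[:, None]).mean(axis=1)  # P(sim == obs), for ties
    return below + rng.uniform(size=obs.size) * equal

# Toy check: data simulated from the model itself should give
# roughly uniform residuals (no lack of fit).
rng = np.random.default_rng(1)
mu = rng.uniform(5, 50, size=200)
obs = rng.poisson(mu)
sims = rng.poisson(mu[:, None], size=(200, 250))
res = simulated_residuals(sims, obs, rng=2)
```

Under a correct model these residuals are uniform on [0, 1]; DHARMa's QQ plot, uniformity test and dispersion test are all built on that property.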
Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they provide a comparison between predictions drawn from the model and the observed data used to train the model.
See list of available diagnostics by name
bayesplot PPC module:
ppc_bars
ppc_bars_grouped
ppc_boxplot
ppc_dens
ppc_dens_overlay
ppc_dens_overlay_grouped
ppc_ecdf_overlay
ppc_ecdf_overlay_grouped
ppc_error_binned
ppc_error_hist
ppc_error_hist_grouped
ppc_error_scatter
ppc_error_scatter_avg
ppc_error_scatter_avg_grouped
ppc_error_scatter_avg_vs_x
ppc_freqpoly
ppc_freqpoly_grouped
ppc_hist
ppc_intervals
ppc_intervals_grouped
ppc_km_overlay
ppc_km_overlay_grouped
ppc_loo_intervals
ppc_loo_pit
ppc_loo_pit_overlay
ppc_loo_pit_qq
ppc_loo_ribbon
ppc_pit_ecdf
ppc_pit_ecdf_grouped
ppc_ribbon
ppc_ribbon_grouped
ppc_rootogram
ppc_scatter
ppc_scatter_avg
ppc_scatter_avg_grouped
ppc_stat
ppc_stat_2d
ppc_stat_freqpoly
ppc_stat_freqpoly_grouped
ppc_stat_grouped
ppc_violin_grouped
- dens_overlay: plots the density distribution of the observed data (black line) overlaid on top of 50 density distributions generated from draws from the model (light blue). Ideally, the 50 realisations should be roughly consistent with the observed data.
The model draws appear to deviate from the observed data.
- error_scatter_avg: this plots the observed values against the average residuals. Similar to a residual plot, we do not want to see any patterns in this plot. There is some pattern remaining in these residuals.
Using all posterior draws for ppc type 'error_scatter_avg' by default.
The predictive error seems to be related to the predictor - the model performs poorest at higher mussel clump areas.
- error_scatter_avg_vs_x: this plots the average residuals against the predictor and so is similar to a regular residual plot; it should be interpreted as such.
Using all posterior draws for ppc type 'error_scatter_avg_vs_x' by default.
- intervals: plots the observed data overlaid on top of posterior predictions associated with each level of the predictor. Ideally, the observed data should all fall within the predictive intervals.
Using all posterior draws for ppc type 'intervals' by default.
The modelled predictions seem to underestimate the uncertainty with increasing mussel clump area.
- ribbon: this is just an alternative way of expressing the above plot.
Using all posterior draws for ppc type 'ribbon' by default.
The shinystan package allows the full suite of MCMC diagnostics and posterior predictive checks to be accessed via a web interface.
DHARMa residuals provide very useful diagnostics. Unfortunately, we cannot directly use the simulateResiduals() function to generate the simulated residuals. However, if we are willing to calculate some of the components ourselves, we can still obtain the simulated residuals from the fitted Stan model.
We need to supply:
- simulated (predicted) responses associated with each observation.
- observed values
- fitted (predicted) responses (averaged) associated with each observation
preds <- posterior_predict(peake.brm3, ndraws=250, summary=FALSE)
peake.resids <- createDHARMa(simulatedResponse = t(preds),
observedResponse = peake$INDIV,
fittedPredictedResponse = apply(preds, 2, median),
integerResponse = TRUE)
plot(peake.resids)
DHARMa nonparametric dispersion test via sd of residuals fitted vs.
simulated
data: simulationOutput
dispersion = 96.263, p-value < 2.2e-16
alternative hypothesis: two.sided
peake.resids <- make_brms_dharma_res(peake.brm3, integerResponse = FALSE)
wrap_elements(~testUniformity(peake.resids)) +
wrap_elements(~plotResiduals(peake.resids, form = factor(rep(1, nrow(peake))))) +
wrap_elements(~plotResiduals(peake.resids, quantreg = FALSE)) +
wrap_elements(~testDispersion(peake.resids))
Conclusions:
- the simulated residuals suggest a general lack of fit due to overdispersion and outliers
- perhaps we should explore a negative binomial model
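The reason the negative binomial is the natural next step: the Poisson forces Var(y) = mu, whereas the NB2 parameterisation used by rstanarm and brms allows Var(y) = mu + mu^2/shape, recovering the Poisson as shape grows large. A quick numerical check of this mean-variance relationship via the Gamma-Poisson mixture construction (Python sketch; the specific mu and shape values are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, shape = 100.0, 5.0  # NB2: Var(y) = mu + mu^2 / shape

# Draw NB2 variates via the Gamma-Poisson mixture:
# lambda ~ Gamma(shape, scale = mu/shape); y | lambda ~ Poisson(lambda)
lam = rng.gamma(shape, mu / shape, size=200_000)
y = rng.poisson(lam)

print(round(y.mean(), 1))        # close to mu = 100
print(round(y.var(), 0))         # close to mu + mu^2/shape = 2100
print(mu + mu**2 / shape)        # theoretical variance: 2100.0
```

A Poisson with the same mean would have variance 100; the mussel-clump counts clearly vary far more than that, which is exactly the overdispersion the DHARMa test flagged.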
8 Fit Negative Binomial rstanarm
Drawing from prior...
Conclusions:
- in each case, the prior is substantially wider than the posterior, suggesting that the posterior is not biased towards the prior.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: The following arguments were unrecognized and ignored: x
Different models can be compared via information criteria. LOO (Leave One Out) IC is similar to AIC, except that AIC does not consider priors and assumes that the posterior is multivariate normal, whereas LOOIC makes neither assumption and integrates over all uncertainty.
- ELPD: the theoretical expected log pointwise predictive density for a new dataset; it can be estimated via leave-one-out cross-validation.
- elpd_loo: the Bayesian LOO estimate of ELPD.
- SE of elpd_loo: the Monte Carlo standard error - an estimate of the computational accuracy of the MCMC and importance sampling used to compute elpd_loo.
- p_loo: the difference between elpd_loo and the non-cross-validated ELPD estimate, and therefore a measure of the relative extra difficulty of predicting new data compared to the observed data. It can also be a measure of the effective number of parameters:
  - in a well behaved model, p_loo will be less than the number of estimated parameters and the total sample size.
  - in a misspecified model, p_loo will be greater than the number of estimated parameters or the total sample size.
- Pareto k: a relative measure of the influence of each observation, as well as the accuracy with which the associated leave-one-out calculations can be estimated:
  - when k < 0.5: the corresponding components can be accurately estimated.
  - when 0.5 < k < 0.7: the accuracy is lower but still acceptable.
  - when k > 0.7: the accuracy is too low and elpd_loo is unreliable. This can also suggest a misspecified model.

More information about the above can be gleaned from https://mc-stan.org/loo/reference/loo-glossary.html.
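The relationships among these quantities can be sketched directly from the S x n matrix of pointwise log-likelihoods. Note that the loo package stabilises the importance weights with Pareto-smoothed importance sampling (PSIS); the raw importance-sampling version below (Python, with a hypothetical `loo_summary` helper) is only illustrative:

```python
import numpy as np

def logmeanexp(x, axis=0):
    """Numerically stable log of the mean of exp(x)."""
    m = x.max(axis=axis)
    return m + np.log(np.mean(np.exp(x - m), axis=axis))

def loo_summary(log_lik):
    """log_lik: (S draws, n observations) pointwise log-likelihoods.

    Returns naive importance-sampling estimates of elpd_loo and p_loo.
    (The loo package smooths the weights with PSIS; this raw-weight
    sketch is only illustrative.)
    """
    # lpd: log pointwise predictive density (non-cross-validated)
    lpd = logmeanexp(log_lik, axis=0)
    # IS-LOO: p(y_i | y_-i) ~= 1 / mean_s[ 1 / p(y_i | theta_s) ]
    elpd_loo_i = -logmeanexp(-log_lik, axis=0)
    elpd_loo = elpd_loo_i.sum()
    # p_loo: gap between within-sample and leave-one-out density
    p_loo = lpd.sum() - elpd_loo
    return elpd_loo, p_loo

# Toy example: 2000 draws by 25 observations
rng = np.random.default_rng(42)
log_lik = rng.normal(-6.0, 0.3, size=(2000, 25))
elpd_loo, p_loo = loo_summary(log_lik)
```

By Jensen's inequality p_loo is always non-negative; large values relative to the parameter count are the red flag seen for the Poisson model below.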
Start with the Poisson model
Warning: Found 10 observation(s) with a pareto_k > 0.7. We recommend calling 'loo' again with argument 'k_threshold = 0.7' in order to calculate the ELPD without the assumption that these observations are negligible. This will refit the model 10 times to compute the ELPDs for the problematic observations directly.
Computed from 2400 by 25 log-likelihood matrix
Estimate SE
elpd_loo -928.3 296.7
p_loo 118.3 45.4
looic 1856.5 593.4
------
Monte Carlo SE of elpd_loo is NA.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 14 56.0% 951
(0.5, 0.7] (ok) 1 4.0% 497
(0.7, 1] (bad) 5 20.0% 55
(1, Inf) (very bad) 5 20.0% 1
See help('pareto-k-diagnostic') for details.
Conclusions:
- p_loo is substantially higher than the number of estimated parameters
- there are some Pareto k values identified as bad or even very bad
Now for the Negative Binomial model
Computed from 2400 by 25 log-likelihood matrix
Estimate SE
elpd_loo -156.8 5.3
p_loo 1.9 0.5
looic 313.6 10.7
------
Monte Carlo SE of elpd_loo is 0.0.
All Pareto k estimates are good (k < 0.5).
See help('pareto-k-diagnostic') for details.
Conclusions:
- p_loo is lower than the number of estimated parameters
- there are no bad Pareto k values
We can also compare the models.
The difference in Expected Log Pointwise Predictive Density (elpd_diff) computes the difference in elpd_loo between two models (expressed relative to the model with the higher elpd_loo). The difference will be negative if the expected out-of-sample predictive accuracy of the first model is higher; if the difference is positive, the second model is preferred.
Note, the above assumes that both models are valid. In the current case, we know that the Poisson model was overdispersed and thus it should not be a genuine candidate. Nevertheless, we will compare the Poisson to the Negative Binomial for illustrative purposes.
elpd_diff se_diff
peake.rstanarm4 0.0 0.0
peake.rstanarm3 -771.5 293.1
Conclusions:
- the elpd_diff for the Poisson model is negative (relative to the Negative Binomial model), indicating that the Negative Binomial model is the better of the two.
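The arithmetic behind such a comparison table can be sketched from the pointwise elpd_loo contributions of the two models (Python sketch; `elpd_compare` is a hypothetical helper mirroring what loo_compare reports):

```python
import numpy as np

def elpd_compare(elpd_pointwise_a, elpd_pointwise_b):
    """Compare two models via their pointwise elpd_loo contributions.

    Returns (elpd_diff, se_diff) for model b relative to model a.
    se_diff is computed from the *paired* pointwise differences, which
    is why loo_compare's SE is usually much smaller than the SEs of
    the two individual elpd_loo estimates.
    """
    diff = np.asarray(elpd_pointwise_b) - np.asarray(elpd_pointwise_a)
    n = diff.size
    return diff.sum(), np.sqrt(n * diff.var(ddof=1))

# Toy pointwise elpds: model B fits each observation a little better
rng = np.random.default_rng(0)
elpd_a = rng.normal(-40, 5, size=25)
elpd_b = elpd_a + rng.normal(2, 1, size=25)
elpd_diff, se_diff = elpd_compare(elpd_a, elpd_b)
```

A positive elpd_diff here means model B is preferred; loo_compare orders its table so the best model sits at 0.0 and the others are expressed as (negative) differences from it.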
peake.brm4 = brm(bf(INDIV ~ log(AREA), family=negbinomial()),
data=peake,
prior=c(
prior(normal(6, 1.5), class='Intercept'),
prior(normal(0, 1), class='b'),
## prior(gamma(0.01, 0.01), class='shape')
prior(gamma(2, 1), class='shape')
),
sample_prior=TRUE,
iter = 5000, warmup = 1000,
chains = 3, cores = 3,
thin = 5, refresh = 0,
backend = "cmdstanr")
Start sampling
Running MCMC with 3 parallel chains...
Chain 1 finished in 0.1 seconds.
Chain 2 finished in 0.1 seconds.
Chain 3 finished in 0.1 seconds.
All 3 chains finished successfully.
Mean chain execution time: 0.1 seconds.
Total execution time: 0.2 seconds.
Conclusions:
- in each case, the prior is substantially wider than the posterior, suggesting that the posterior is not biased towards the prior.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: The following arguments were unrecognized and ignored: x
Using all posterior draws for ppc type 'intervals' by default.
preds <- posterior_predict(peake.brm4, ndraws=250, summary=FALSE)
peake.resids <- createDHARMa(simulatedResponse = t(preds),
observedResponse = peake$INDIV,
fittedPredictedResponse = apply(preds, 2, median),
integerResponse = TRUE)
plot(peake.resids)
peake.resids <- make_brms_dharma_res(peake.brm4, integerResponse = FALSE)
wrap_elements(~testUniformity(peake.resids)) +
wrap_elements(~plotResiduals(peake.resids, form = factor(rep(1, nrow(peake))))) +
wrap_elements(~plotResiduals(peake.resids, quantreg = FALSE)) +
wrap_elements(~testDispersion(peake.resids))
Different models can be compared via information criteria. LOO (Leave One Out) IC is similar to AIC, except that AIC does not consider priors and assumes that the posterior is multivariate normal, whereas LOOIC makes neither assumption and integrates over all uncertainty.
- ELPD: the theoretical expected log pointwise predictive density for a new dataset; it can be estimated via leave-one-out cross-validation.
- elpd_loo: the Bayesian LOO estimate of ELPD.
- SE of elpd_loo: the Monte Carlo standard error - an estimate of the computational accuracy of the MCMC and importance sampling used to compute elpd_loo.
- p_loo: the difference between elpd_loo and the non-cross-validated ELPD estimate, and therefore a measure of the relative extra difficulty of predicting new data compared to the observed data. It can also be a measure of the effective number of parameters:
  - in a well behaved model, p_loo will be less than the number of estimated parameters and the total sample size.
  - in a misspecified model, p_loo will be greater than the number of estimated parameters or the total sample size.
- Pareto k: a relative measure of the influence of each observation, as well as the accuracy with which the associated leave-one-out calculations can be estimated:
  - when k < 0.5: the corresponding components can be accurately estimated.
  - when 0.5 < k < 0.7: the accuracy is lower but still acceptable.
  - when k > 0.7: the accuracy is too low and elpd_loo is unreliable. This can also suggest a misspecified model.

More information about the above can be gleaned from https://mc-stan.org/loo/reference/loo-glossary.html.
Start with the Poisson model
Warning: Found 6 observations with a pareto_k > 0.7 in model 'peake.brm3'. It
is recommended to set 'moment_match = TRUE' in order to perform moment matching
for problematic observations.
Computed from 2400 by 25 log-likelihood matrix
Estimate SE
elpd_loo -927.8 297.8
p_loo 117.0 46.6
looic 1855.6 595.6
------
Monte Carlo SE of elpd_loo is NA.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 15 60.0% 595
(0.5, 0.7] (ok) 4 16.0% 95
(0.7, 1] (bad) 2 8.0% 64
(1, Inf) (very bad) 4 16.0% 1
See help('pareto-k-diagnostic') for details.
Conclusions:
- p_loo is substantially higher than the number of estimated parameters
- there are some Pareto k values identified as bad or even very bad
Now for the Negative Binomial model
Computed from 2400 by 25 log-likelihood matrix
Estimate SE
elpd_loo -156.5 5.4
p_loo 2.0 0.5
looic 312.9 10.8
------
Monte Carlo SE of elpd_loo is 0.0.
All Pareto k estimates are good (k < 0.5).
See help('pareto-k-diagnostic') for details.
Conclusions:
- p_loo is lower than the number of estimated parameters
- there are no bad Pareto k values
We can also compare the models.
The difference in Expected Log Pointwise Predictive Density (elpd_diff) computes the difference in elpd_loo between two models (expressed relative to the model with the higher elpd_loo). The difference will be negative if the expected out-of-sample predictive accuracy of the first model is higher; if the difference is positive, the second model is preferred.
Note, the above assumes that both models are valid. In the current case, we know that the Poisson model was overdispersed and thus it should not be a genuine candidate. Nevertheless, we will compare the Poisson to the Negative Binomial for illustrative purposes.
elpd_diff se_diff
peake.brm4 0.0 0.0
peake.brm3 -771.3 294.1
Conclusions:
- the elpd_diff for the Poisson model is negative (relative to the Negative Binomial model), indicating that the Negative Binomial model is the better of the two.
9 Partial effects plots
peake.rstanarm4 |> fitted_draws(newdata=peake) |>
median_hdci() |>
ggplot(aes(x=AREA, y=.value)) +
geom_ribbon(aes(ymin=.lower, ymax=.upper), fill='blue', alpha=0.3) +
geom_line() +
geom_point(data=peake, aes(y=INDIV, x=AREA))
Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
Use [add_]epred_draws() to get the expectation of the posterior predictive.
Use [add_]linpred_draws() to get the distribution of the linear predictor.
For example, you used [add_]fitted_draws(..., scale = "response"), which
means you most likely want [add_]epred_draws(...).
10 Model investigation
The summary() method generates simple summaries (mean, standard deviation as well as 10, 50 and 90 percentiles).
Model Info:
function: stan_glm
family: neg_binomial_2 [log]
formula: INDIV ~ log(AREA)
algorithm: sampling
sample: 2400 (posterior sample size)
priors: see help('prior_summary')
observations: 25
predictors: 2
Estimates:
mean sd 10% 50% 90%
(Intercept) -1.2 0.7 -2.1 -1.2 -0.3
log(AREA) 0.8 0.1 0.7 0.8 0.9
reciprocal_dispersion 4.6 1.3 3.1 4.5 6.3
Fit Diagnostics:
mean sd 10% 50% 90%
mean_PPD 480.9 89.6 375.6 473.3 597.4
The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
MCMC diagnostics
mcse Rhat n_eff
(Intercept) 0.0 1.0 2306
log(AREA) 0.0 1.0 2306
reciprocal_dispersion 0.0 1.0 2553
mean_PPD 1.9 1.0 2151
log-posterior 0.0 1.0 2248
For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
Conclusions:
- in the Model Info, we are informed that the total MCMC posterior sample size is 2400 and that there were 25 raw observations.
- the estimated intercept (expected number of individuals for a mussel clump of log-transformed area = 0) is -1.17. This is the mean of the posterior distribution for this parameter. If we back-transform this to the response scale, it becomes 0.31: when the log-transformed mussel clump area is 0 (an area of just 1 mm2), we expect to find 0.31 individuals in the clump (obviously not a biologically meaningful clump size!).
- the estimated slope (rate at which the number of individuals changes per 1 unit change in log-transformed area) is 0.83 (mean) with a standard deviation of 0.1. The 10 and 90 percentiles indicate that the slope very likely lies between 0.7 and 0.9 - i.e. there is a significant positive trend. When back-transformed onto the response scale, we see that for a one unit increase in log-transformed mussel clump area, the number of individuals increases by a factor of 2.29. This represents a 129% increase per 1 unit change in log-transformed mussel clump area.
- Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
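The back-transformations quoted above follow directly from the log link and are easy to verify (Python sketch using the rounded posterior estimates reported in the summary):

```python
import math

intercept, slope = -1.17, 0.83  # posterior point estimates from the summary above

# Back-transform from the link (log) scale to the response scale
baseline = math.exp(intercept)    # expected count at log(AREA) = 0
factor = math.exp(slope)          # multiplicative change per unit log(AREA)
pct_change = 100 * (factor - 1)   # same change expressed as a percentage

print(round(baseline, 2))    # 0.31
print(round(factor, 2))      # 2.29
print(round(pct_change))     # 129
```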
peake.rstanarm4$stanfit |>
summarise_draws(median,
HDInterval::hdi,
rhat, length, ess_bulk, ess_tail)
# A tibble: 5 × 8
variable median lower upper rhat length ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 (Intercept) -1.18 -2.62 0.179 0.999 2400 2319. 2343.
2 log(AREA) 0.826 0.659 0.994 0.999 2400 2318. 2352.
3 reciprocal_dispersi… 4.48 2.28 7.15 1.00 2400 2542. 2354.
4 mean_PPD 473. 317. 662. 1.00 2400 2195. 2065.
5 log-posterior -161. -164. -160. 0.999 2400 2232. 2315.
We can also alter the CI level.
peake.rstanarm4$stanfit |>
summarise_draws(median,
~HDInterval::hdi(.x, credMass = 0.9),
rhat, length, ess_bulk, ess_tail)
# A tibble: 5 × 8
variable median lower upper rhat length ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 (Intercept) -1.18 -2.43 -6.81e-2 0.999 2400 2319. 2343.
2 log(AREA) 0.826 0.690 9.68e-1 0.999 2400 2318. 2352.
3 reciprocal_dispersi… 4.48 2.46 6.56e+0 1.00 2400 2542. 2354.
4 mean_PPD 473. 337. 6.21e+2 1.00 2400 2195. 2065.
5 log-posterior -161. -163. -1.60e+2 0.999 2400 2232. 2315.
Arguably, it would be better to back-transform to the ratio scale
peake.rstanarm4$stanfit |>
## tidy_draws() |>
## exp() |>
summarise_draws(
~ median(exp(.x)),
~HDInterval::hdi(exp(.x)),
rhat, length, ess_bulk, ess_tail)
# A tibble: 5 × 8
variable `~median(exp(.x))` lower upper rhat length ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 (Interc… 3.07e- 1 2.40e- 2 1.03e+ 0 0.999 2400 2319. 2343.
2 log(ARE… 2.29e+ 0 1.93e+ 0 2.70e+ 0 0.999 2400 2318. 2352.
3 recipro… 8.85e+ 1 5.24e+ 0 1.04e+ 3 1.00 2400 2542. 2354.
4 mean_PPD 3.63e+205 7.64e+103 2.60e+280 1.00 2400 2195. 2065.
5 log-pos… 7.55e- 71 9.41e- 75 2.16e- 70 0.999 2400 2232. 2315.
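A note on these back-transformed summaries: quantile-based point estimates such as the median commute with monotone transforms like exp() (exp of the median equals the median of the exponentiated draws), but HPD interval endpoints do not in general, because exp() reshapes the density. That is why the draws are exponentiated before summarising. A Python sketch of the commuting property (the slope mean and sd are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative posterior draws for a slope on the log-link scale
draws = rng.normal(0.83, 0.09, size=50_001)

# The median commutes with any monotone transform such as exp():
# the middle order statistic of exp(draws) is exp() of the middle
# order statistic of draws.
med_then_exp = np.exp(np.median(draws))
exp_then_med = np.median(np.exp(draws))
print(np.isclose(med_then_exp, exp_then_med))
```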
tidyMCMC(peake.rstanarm4$stanfit, estimate.method='median', conf.int=TRUE, conf.method='HPDinterval', rhat=TRUE, ess=TRUE)
# A tibble: 5 × 7
term estimate std.error conf.low conf.high rhat ess
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 (Intercept) -1.18 0.733 -2.62 0.179 0.999 2306
2 log(AREA) 0.826 0.0867 0.659 0.994 0.999 2306
3 reciprocal_dispersion 4.48 1.27 2.28 7.15 1.00 2553
4 mean_PPD 473. 89.6 317. 662. 1.00 2151
5 log-posterior -161. 1.27 -164. -160. 0.999 2248
Conclusions:
- the estimated intercept (expected number of individuals for a mussel clump of log-transformed area = 0) is -1.18. This is the median of the posterior distribution for this parameter. If we back-transform this to the response scale, it becomes 0.31: when the log-transformed mussel clump area is 0 (an area of just 1 mm2), we expect to find 0.31 individuals in the clump (obviously not a biologically meaningful clump size!).
- the estimated slope (rate at which the number of individuals changes per 1 unit change in log-transformed area) is 0.83 (median) with a standard error of 0.09. The 95% credible interval indicates that we are 95% confident that the slope is between 0.66 and 0.99 - i.e. there is a significant positive trend. When back-transformed onto the response scale, we see that for a one unit increase in log-transformed mussel clump area, the number of individuals increases by a factor of 2.29. This represents a 129% increase per 1 unit change in log-transformed mussel clump area.
- Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
# A draws_df: 800 iterations, 3 chains, and 5 variables
(Intercept) log(AREA) reciprocal_dispersion mean_PPD log-posterior
1 -1.09 0.80 3.0 450 -162
2 -1.15 0.83 6.3 539 -161
3 -0.87 0.80 5.4 552 -161
4 -0.78 0.79 4.1 619 -161
5 -1.78 0.90 6.5 551 -161
6 -2.01 0.93 5.9 614 -162
7 -2.13 0.95 8.0 574 -165
8 -0.39 0.74 3.0 453 -162
9 -1.23 0.84 5.0 362 -160
10 -1.14 0.83 5.1 520 -161
# ... with 2390 more draws
# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
## summarised
peake.rstanarm4$stanfit |>
as_draws_df() |>
exp() |>
summarise_draws(median,
HDInterval::hdi,
rhat,
ess_bulk, ess_tail)
# A tibble: 5 × 7
variable median lower upper rhat ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num>
1 (Intercept) 3.07e- 1 2.40e- 2 1.03e+ 0 0.999 2319. 2343.
2 log(AREA) 2.29e+ 0 1.93e+ 0 2.70e+ 0 0.999 2318. 2352.
3 reciprocal_dispersion 8.85e+ 1 5.24e+ 0 1.04e+ 3 1.00 2542. 2354.
4 mean_PPD 3.63e+205 7.64e+103 2.60e+280 1.00 2199. NA
5 log-posterior 7.55e- 71 9.41e- 75 2.16e- 70 0.999 2232. NA
## summarised on fractional scale
peake.rstanarm4$stanfit |>
as_draws_df() |>
dplyr::select(matches('Intercept|AREA')) |>
summarise_draws(median,
HDInterval::hdi,
rhat,
length,
ess_bulk, ess_tail)
Warning: Dropping 'draws_df' class as required metadata was removed.
# A tibble: 2 × 8
variable median lower upper rhat length ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 (Intercept) -1.18 -2.62 0.179 1.00 2400 2319. 2326.
2 log(AREA) 0.826 0.659 0.994 1.00 2400 2318. 2344.
## OR
names <- peake.rstanarm4 |>
formula() |>
model.matrix(peake) |>
colnames()
peake.rstanarm4$stanfit |>
as_draws_df() |>
dplyr::select(any_of(names)) |>
mutate(across(everything(), exp)) |>
summarise_draws(median,
HDInterval::hdi,
rhat,
length,
ess_bulk, ess_tail)
Warning: Dropping 'draws_df' class as required metadata was removed.
# A tibble: 2 × 8
variable median lower upper rhat length ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 (Intercept) 0.307 0.0240 1.03 1.00 2400 2319. 2326.
2 log(AREA) 2.29 1.93 2.70 1.00 2400 2318. 2344.
Due to the presence of a log transform in the predictor, it is better to use the regex version.
[1] "(Intercept)" "log(AREA)" "reciprocal_dispersion"
[4] "accept_stat__" "stepsize__" "treedepth__"
[7] "n_leapfrog__" "divergent__" "energy__"
# A tibble: 4,800 × 5
# Groups: .variable [2]
.chain .iteration .draw .variable .value
<int> <int> <int> <chr> <dbl>
1 1 1 1 (Intercept) -1.09
2 1 2 2 (Intercept) -1.15
3 1 3 3 (Intercept) -0.874
4 1 4 4 (Intercept) -0.777
5 1 5 5 (Intercept) -1.78
6 1 6 6 (Intercept) -2.01
7 1 7 7 (Intercept) -2.13
8 1 8 8 (Intercept) -0.392
9 1 9 9 (Intercept) -1.23
10 1 10 10 (Intercept) -1.14
# ℹ 4,790 more rows
We can then summarise this
# A tibble: 2 × 7
.variable .value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 (Intercept) -1.18 -2.62 0.179 0.95 median hdci
2 log(AREA) 0.826 0.659 0.994 0.95 median hdci
peake.rstanarm4 |>
gather_draws(`.Intercept.*|.*AREA.*`, regex=TRUE) |>
ggplot() +
stat_halfeye(aes(x=.value, y=.variable)) +
facet_wrap(~.variable, scales='free')
We could alternatively express the parameters on the response scale.
peake.rstanarm4 |>
gather_draws(`.Intercept.*|.*AREA.*`, regex=TRUE) |>
group_by(.variable) |>
mutate(.value=exp(.value)) |>
median_hdci()
# A tibble: 2 × 7
.variable .value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 (Intercept) 0.307 0.0240 1.03 0.95 median hdci
2 log(AREA) 2.29 1.93 2.70 0.95 median hdci
Conclusions:
- the estimated intercept (expected number of individuals for a mussel clump of log-transformed area = 0) is -1.18. This is the median of the posterior distribution for this parameter. If we back-transform this to the response scale, it becomes 0.31: when the log-transformed mussel clump area is 0 (an area of just 1 mm2), we expect to find 0.31 individuals in the clump (obviously not a biologically meaningful clump size!).
- the estimated slope (rate at which the number of individuals changes per 1 unit change in log-transformed area) is 0.83 (median) with a standard error of 0.09. The 95% credible interval indicates that we are 95% confident that the slope is between 0.66 and 0.99 - i.e. there is a significant positive trend. When back-transformed onto the response scale, we see that for a one unit increase in log-transformed mussel clump area, the number of individuals increases by a factor of 2.29. This represents a 129% increase per 1 unit change in log-transformed mussel clump area.
- Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
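The HPD (highest posterior density) intervals used throughout (HDInterval::hdi, median_hdci) are the shortest intervals containing the requested posterior mass. The core idea can be sketched as follows (Python; `hdi` here is a hypothetical re-implementation, not HDInterval's):

```python
import numpy as np

def hdi(draws, cred_mass=0.95):
    """Highest density interval: the shortest interval containing
    cred_mass of the draws (the same idea as HDInterval::hdi)."""
    x = np.sort(np.asarray(draws))
    n = x.size
    k = int(np.ceil(cred_mass * n))      # number of draws inside the interval
    widths = x[k - 1:] - x[:n - k + 1]   # width of every candidate window
    i = int(np.argmin(widths))           # index of the shortest window
    return x[i], x[i + k - 1]

# For a symmetric posterior the HDI matches the central interval;
# for a skewed one it is narrower and shifted towards the mode.
rng = np.random.default_rng(0)
draws = rng.lognormal(0.0, 0.5, size=100_000)
lo, hi = hdi(draws, 0.95)
```

For skewed posteriors (such as exponentiated coefficients) this is why the HPD interval is not symmetric about the median.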
This is purely a graphical depiction of the posteriors.
# A tibble: 2,400 × 12
.chain .iteration .draw `(Intercept)` `log(AREA)` reciprocal_dispersion
<int> <int> <int> <dbl> <dbl> <dbl>
1 1 1 1 -1.09 0.802 2.99
2 1 2 2 -1.15 0.825 6.25
3 1 3 3 -0.874 0.798 5.39
4 1 4 4 -0.777 0.787 4.12
5 1 5 5 -1.78 0.900 6.48
6 1 6 6 -2.01 0.932 5.89
7 1 7 7 -2.13 0.952 8.00
8 1 8 8 -0.392 0.740 3.01
9 1 9 9 -1.23 0.836 5.05
10 1 10 10 -1.14 0.831 5.14
# ℹ 2,390 more rows
# ℹ 6 more variables: accept_stat__ <dbl>, stepsize__ <dbl>, treedepth__ <dbl>,
# n_leapfrog__ <dbl>, divergent__ <dbl>, energy__ <dbl>
# A tibble: 2,400 × 5
.chain .iteration .draw `(Intercept)` `log(AREA)`
<int> <int> <int> <dbl> <dbl>
1 1 1 1 -1.09 0.802
2 1 2 2 -1.15 0.825
3 1 3 3 -0.874 0.798
4 1 4 4 -0.777 0.787
5 1 5 5 -1.78 0.900
6 1 6 6 -2.01 0.932
7 1 7 7 -2.13 0.952
8 1 8 8 -0.392 0.740
9 1 9 9 -1.23 0.836
10 1 10 10 -1.14 0.831
# ℹ 2,390 more rows
## summarised
peake.rstanarm4 |>
spread_draws(`.Intercept.*|.*AREA.*`, regex=TRUE) |>
summarise_draws("median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk")
# A tibble: 2 × 6
variable median lower upper rhat ess_bulk
<chr> <num> <num> <num> <num> <num>
1 (Intercept) -1.18 -2.62 0.179 0.999 2319.
2 log(AREA) 0.826 0.659 0.994 0.999 2318.
## summarised on fractional scale
peake.rstanarm4 |>
spread_draws(`.Intercept.*|.*AREA.*`, regex=TRUE) |>
mutate(across(everything(), exp)) |>
summarise_draws("median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk")
# A tibble: 2 × 6
variable median lower upper rhat ess_bulk
<chr> <num> <num> <num> <num> <num>
1 (Intercept) 0.307 0.0240 1.03 0.999 2319.
2 log(AREA) 2.29 1.93 2.70 0.999 2318.
Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
recommended alternatives.
# A tibble: 2,400 × 3
`(Intercept)` `log(AREA)` reciprocal_dispersion
<dbl> <dbl> <dbl>
1 -1.09 0.802 2.99
2 -1.15 0.825 6.25
3 -0.874 0.798 5.39
4 -0.777 0.787 4.12
5 -1.78 0.900 6.48
6 -2.01 0.932 5.89
7 -2.13 0.952 8.00
8 -0.392 0.740 3.01
9 -1.23 0.836 5.05
10 -1.14 0.831 5.14
# ℹ 2,390 more rows
The summary() method generates simple summaries (posterior means, standard errors and 95% credible intervals).
Family: negbinomial
Links: mu = log; shape = identity
Formula: INDIV ~ log(AREA)
Data: peake (Number of observations: 25)
Draws: 3 chains, each with iter = 5000; warmup = 1000; thin = 5;
total post-warmup draws = 2400
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -1.11 0.70 -2.47 0.26 1.00 2337 2212
logAREA 0.82 0.08 0.66 0.98 1.00 2324 2298
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
shape 4.99 1.35 2.72 8.08 1.00 2130 2288
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Conclusions:
- in the model summary header, we are informed that the total post-warmup MCMC sample size is 2400 and that there were 25 raw observations.
- the estimated intercept (expected number of individuals for a mussel clump of log-transformed area = 0) is -1.11. This is the mean of the posterior distribution for this parameter. If we back-transform this to the response scale, it becomes 0.33: when the log-transformed mussel clump area is 0 (an area of just 1 mm2), we expect to find 0.33 individuals in the clump (obviously not a biologically meaningful clump size!).
- the estimated slope (rate at which the number of individuals changes per 1 unit change in log-transformed area) is 0.82 (mean) with a standard deviation of 0.08. The 95% credible interval indicates that we are 95% confident that the slope is between 0.66 and 0.98 - i.e. there is a significant positive trend. When back-transformed onto the response scale, we see that for a one unit increase in log-transformed mussel clump area, the number of individuals increases by a factor of 2.27. This represents a 127% increase per 1 unit change in log-transformed mussel clump area.
- Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
# A tibble: 8 × 8
variable median lower upper rhat length ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 b_Intercept -1.12 -2.54 0.145 1.00 2400 2337. 2212.
2 b_logAREA 0.820 0.668 0.987 1.00 2400 2324. 2298.
3 shape 4.85 2.59 7.76 1.00 2400 2130. 2288.
4 prior_Intercept 5.95 3.29 9.15 1.00 2400 2474. 2340.
5 prior_b 0.00694 -1.90 1.92 1.00 2400 2351. 2180.
6 prior_shape 1.71 0.0616 4.80 0.999 2400 2444. 2410.
7 lprior -5.86 -8.02 -4.03 1.00 2400 2133. 2328.
8 lp__ -159. -163. -158. 1.00 2400 2388. 2319.
We can also alter the CI level.
peake.brm4 |>
summarise_draws(median,
~HDInterval::hdi(.x, credMass = 0.9),
rhat, length, ess_bulk, ess_tail)
# A tibble: 8 × 8
variable median lower upper rhat length ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 b_Intercept -1.12 -2.17 0.0483 1.00 2400 2337. 2212.
2 b_logAREA 0.820 0.679 0.943 1.00 2400 2324. 2298.
3 shape 4.85 2.91 7.18 1.00 2400 2130. 2288.
4 prior_Intercept 5.95 3.58 8.42 1.00 2400 2474. 2340.
5 prior_b 0.00694 -1.57 1.61 1.00 2400 2351. 2180.
6 prior_shape 1.71 0.0616 3.88 0.999 2400 2444. 2410.
7 lprior -5.86 -7.58 -4.22 1.00 2400 2133. 2328.
8 lp__ -159. -162. -158. 1.00 2400 2388. 2319.
To narrow down to just the parameters of interest, see the code under the tidy_draws tab.
tidyMCMC(peake.brm4$fit, estimate.method='median', conf.int=TRUE, conf.method='HPDinterval', rhat=TRUE, ess=TRUE)
# A tibble: 7 × 7
term estimate std.error conf.low conf.high rhat ess
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 b_Intercept -1.12 0.695 -2.54 0.145 1.00 2328
2 b_logAREA 0.820 0.0822 0.668 0.987 1.00 2317
3 shape 4.85 1.35 2.59 7.76 1.00 2214
4 prior_Intercept 5.95 1.50 3.29 9.15 1.00 2453
5 prior_b 0.00694 0.988 -1.90 1.92 1.00 2263
6 prior_shape 1.71 1.41 0.0616 4.80 0.999 2418
7 lprior -5.86 1.09 -8.02 -4.03 1.00 2254
Conclusions:
- the estimated intercept (expected number of individuals for a mussel clump of log-transformed area = 0) is -1.12. This is the median of the posterior distribution for this parameter. If we back-transform this to the response scale, it becomes 0.33: when the log-transformed mussel clump area is 0 (an area of just 1 mm2), we expect to find 0.33 individuals in the clump (obviously not a biologically meaningful clump size!).
- the estimated slope (rate at which the number of individuals changes per 1 unit change in log-transformed area) is 0.82 (median) with a standard error of 0.08. The 95% credible interval indicates that we are 95% confident that the slope is between 0.67 and 0.99 - i.e. there is a significant positive trend. When back-transformed onto the response scale, we see that for a one unit increase in log-transformed mussel clump area, the number of individuals increases by a factor of 2.27. This represents a 127% increase per 1 unit change in log-transformed mussel clump area.
- Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
# A draws_df: 800 iterations, 3 chains, and 8 variables
b_Intercept b_logAREA shape prior_Intercept prior_b prior_shape lprior lp__
1 -0.537 0.77 5.1 7.7 0.641 1.32 -6.0 -159
2 -0.223 0.71 4.6 4.9 -0.106 2.58 -5.6 -159
3 -2.505 0.97 5.2 5.8 -1.265 1.95 -6.3 -161
4 -1.311 0.84 5.7 5.6 0.077 0.82 -6.6 -158
5 -1.690 0.88 5.5 7.5 0.769 2.46 -6.5 -159
6 -1.109 0.82 4.2 5.2 -1.838 1.98 -5.4 -159
7 -0.353 0.72 3.7 6.8 -0.675 0.79 -4.9 -160
8 0.017 0.67 5.7 5.7 2.429 2.57 -6.5 -163
9 -1.194 0.81 3.9 7.4 0.774 2.38 -5.2 -160
10 -1.468 0.86 7.2 7.0 0.181 0.65 -7.8 -159
# ... with 2390 more draws
# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
## summarised
peake.brm4 |>
as_draws_df() |>
summarise_draws(
median,
~ HDInterval::hdi(.x),
length,
rhat,
ess_bulk, ess_tail
)
# A tibble: 8 × 8
variable median lower upper length rhat ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 b_Intercept -1.12 -2.54 0.145 2400 1.00 2337. 2212.
2 b_logAREA 0.820 0.668 0.987 2400 1.00 2324. 2298.
3 shape 4.85 2.59 7.76 2400 1.00 2130. 2288.
4 prior_Intercept 5.95 3.29 9.15 2400 1.00 2474. 2340.
5 prior_b 0.00694 -1.90 1.92 2400 1.00 2351. 2180.
6 prior_shape 1.71 0.0616 4.80 2400 0.999 2444. 2410.
7 lprior -5.86 -8.02 -4.03 2400 1.00 2133. 2328.
8 lp__ -159. -163. -158. 2400 1.00 2388. 2319.
## summarised on fractional scale
peake.brm4 |>
as_draws_df() |>
dplyr::select(starts_with("b_")) |>
## mutate(across(everything(), exp)) |>
exp() |>
summarise_draws(median,
HDInterval::hdi,
rhat,
length,
ess_bulk, ess_tail)
Warning: Dropping 'draws_df' class as required metadata was removed.
# A tibble: 2 × 8
variable median lower upper rhat length ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 b_Intercept 0.325 0.0329 1.02 1.00 2400 2332. 2186.
2 b_logAREA 2.27 1.92 2.64 1.00 2400 2319. 2289.
Return the draws (samples) for each parameter in wide format
# A tibble: 2,400 × 17
.chain .iteration .draw b_Intercept b_logAREA shape prior_Intercept prior_b
<int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 1 -0.537 0.765 5.12 7.69 0.641
2 1 2 2 -0.223 0.709 4.60 4.94 -0.106
3 1 3 3 -2.51 0.972 5.19 5.77 -1.27
4 1 4 4 -1.31 0.844 5.67 5.65 0.0766
5 1 5 5 -1.69 0.876 5.53 7.53 0.769
6 1 6 6 -1.11 0.825 4.24 5.23 -1.84
7 1 7 7 -0.353 0.723 3.69 6.78 -0.675
8 1 8 8 0.0172 0.665 5.73 5.68 2.43
9 1 9 9 -1.19 0.810 3.94 7.38 0.774
10 1 10 10 -1.47 0.859 7.18 7.01 0.181
# ℹ 2,390 more rows
# ℹ 9 more variables: prior_shape <dbl>, lprior <dbl>, lp__ <dbl>,
# accept_stat__ <dbl>, treedepth__ <dbl>, stepsize__ <dbl>,
# divergent__ <dbl>, n_leapfrog__ <dbl>, energy__ <dbl>
[1] "b_Intercept" "b_logAREA" "shape" "prior_Intercept"
[5] "prior_b" "prior_shape" "lprior" "lp__"
[9] "accept_stat__" "treedepth__" "stepsize__" "divergent__"
[13] "n_leapfrog__" "energy__"
peake.brm4$fit |>
tidy_draws() |>
dplyr::select(matches('^b_.*')) |>
exp() |>
summarise_draws(median,
HDInterval::hdi,
rhat, length, ess_bulk, ess_tail)
# A tibble: 2 × 8
variable median lower upper rhat length ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num>
1 b_Intercept 0.325 0.0329 1.02 1.00 2400 2332. 2186.
2 b_logAREA 2.27 1.92 2.64 1.00 2400 2319. 2289.
The above can be useful for exploring the full posteriors of all the parameters or for performing specific calculations using the posteriors; however, summarising the parameters takes a few more steps.
# A tibble: 14 × 7
.variable .value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 accept_stat__ 2.61e+ 0 2.08e+ 0 2.72e+ 0 0.95 median hdci
2 b_Intercept 3.25e- 1 3.29e- 2 1.02e+ 0 0.95 median hdci
3 b_logAREA 2.27e+ 0 1.92e+ 0 2.64e+ 0 0.95 median hdci
4 divergent__ 1 e+ 0 1 e+ 0 1 e+ 0 0.95 median hdci
5 energy__ 7.98e+69 5.57e+68 2.92e+71 0.95 median hdci
6 lp__ 5.75e-70 1.09e-73 1.57e-69 0.95 median hdci
7 lprior 2.85e- 3 1.31e- 5 1.10e- 2 0.95 median hdci
8 n_leapfrog__ 1.10e+ 3 2.01e+ 1 1.10e+ 3 0.95 median hdci
9 prior_Intercept 3.84e+ 2 2.93e+ 0 4.59e+ 3 0.95 median hdci
10 prior_b 1.01e+ 0 4.84e- 2 4.88e+ 0 0.95 median hdci
11 prior_shape 5.53e+ 0 1.03e+ 0 1.21e+ 2 0.95 median hdci
12 shape 1.27e+ 2 4.45e+ 0 1.50e+ 3 0.95 median hdci
13 stepsize__ 2.01e+ 0 1.93e+ 0 2.14e+ 0 0.95 median hdci
14 treedepth__ 7.39e+ 0 7.39e+ 0 2.01e+ 1 0.95 median hdci
The gather_draws function, on the other hand, conveniently combines tidy_draws and gather_variables in a single command. Furthermore, it returns all of the variables. The spread_draws function allows users to target specific variables (either by naming them in full or via a regular expression).
Due to the presence of a log transform in the predictor, it is better to use the regex version.
[1] "b_Intercept" "b_logAREA" "shape" "prior_Intercept"
[5] "prior_b" "prior_shape" "lprior" "lp__"
[9] "accept_stat__" "treedepth__" "stepsize__" "divergent__"
[13] "n_leapfrog__" "energy__"
# A tibble: 4,800 × 5
# Groups: .variable [2]
.chain .iteration .draw .variable .value
<int> <int> <int> <chr> <dbl>
1 1 1 1 b_Intercept -0.537
2 1 2 2 b_Intercept -0.223
3 1 3 3 b_Intercept -2.51
4 1 4 4 b_Intercept -1.31
5 1 5 5 b_Intercept -1.69
6 1 6 6 b_Intercept -1.11
7 1 7 7 b_Intercept -0.353
8 1 8 8 b_Intercept 0.0172
9 1 9 9 b_Intercept -1.19
10 1 10 10 b_Intercept -1.47
# ℹ 4,790 more rows
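The difference between the spread (wide) and gathered (long) layouts can be sketched in base R with a few made-up draws (these are not the model's actual values): the wide layout has one column per parameter, while the long layout stacks them into .variable/.value pairs.

```r
## made-up draws: wide format (one column per parameter)
wide <- data.frame(.draw = 1:3,
                   b_Intercept = c(-0.5, -1.1, -1.7),
                   b_logAREA   = c(0.77, 0.82, 0.88))

## long ("gathered") format: one row per draw x parameter
long <- data.frame(.draw     = rep(wide$.draw, times = 2),
                   .variable = rep(c("b_Intercept", "b_logAREA"), each = 3),
                   .value    = c(wide$b_Intercept, wide$b_logAREA))

nrow(wide)  # 3 rows (one per draw)
nrow(long)  # 6 rows (3 draws x 2 parameters)
```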
We can then summarise this
# A tibble: 2 × 7
.variable .value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 b_Intercept -1.12 -2.54 0.145 0.95 median hdci
2 b_logAREA 0.820 0.668 0.987 0.95 median hdci
peake.brm4 |>
gather_draws(`b_.*`, regex=TRUE) |>
mutate(.value = exp(.value)) |>
summarise_draws(median,
HDInterval::hdi,
rhat, length, ess_bulk, ess_tail)
# A tibble: 2 × 9
# Groups: .variable [2]
.variable variable median lower upper rhat length ess_bulk ess_tail
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 b_Intercept .value 0.325 0.0329 1.02 1.00 2400 2337. 2212.
2 b_logAREA .value 2.27 1.92 2.64 1.00 2400 2324. 2298.
peake.brm4 |>
gather_draws(`b_.*`, regex=TRUE) |>
ggplot() +
stat_halfeye(aes(x=.value, y=.variable)) +
facet_wrap(~.variable, scales='free')
We could alternatively express the parameters on the response scale.
peake.brm4 |>
gather_draws(`.Intercept.*|.*AREA.*`, regex=TRUE) |>
group_by(.variable) |>
mutate(.value=exp(.value)) |>
median_hdci()
# A tibble: 1 × 7
.variable .value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 b_logAREA 2.27 1.92 2.64 0.95 median hdci
Conclusions:
- the estimated intercept (the log of the expected number of individuals for a mussel clump of log-transformed area = 0) is -1.12. This is the median of the posterior distribution for this parameter. If we back-transform this to the response scale, it becomes 0.33. That is, when the log-transformed mussel clump area is 0 (equivalent to an area of 1 mm2), we expect to find 0.33 individuals in the mussel clump (a clump this small is well outside the range of the observed data, so this value has little biological meaning!).
- the estimated slope (rate at which the number of individuals changes per 1 unit change in log-transformed area) is 0.82 (median) with a standard error of 0.08. The 95% credibility intervals indicate that we are 95% confident that the slope is between 0.67 and 0.99 - i.e. there is a significant positive trend. When back-transformed onto the response scale, we see that for a one unit increase in log-transformed mussel clump area, the number of individuals increases by a factor of 2.27. This represents a 127% increase per 1 unit change in log-transformed mussel clump area.
- Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
# A tibble: 2,400 × 5
.chain .iteration .draw b_Intercept b_logAREA
<int> <int> <int> <dbl> <dbl>
1 1 1 1 -0.537 0.765
2 1 2 2 -0.223 0.709
3 1 3 3 -2.51 0.972
4 1 4 4 -1.31 0.844
5 1 5 5 -1.69 0.876
6 1 6 6 -1.11 0.825
7 1 7 7 -0.353 0.723
8 1 8 8 0.0172 0.665
9 1 9 9 -1.19 0.810
10 1 10 10 -1.47 0.859
# ℹ 2,390 more rows
## summarised
peake.brm4 |>
as_draws_df() |>
dplyr::select(starts_with("b_")) |>
summarise_draws("median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk")
Warning: Dropping 'draws_df' class as required metadata was removed.
# A tibble: 2 × 6
variable median lower upper rhat ess_bulk
<chr> <num> <num> <num> <num> <num>
1 b_Intercept -1.12 -2.54 0.145 1.00 2332.
2 b_logAREA 0.820 0.668 0.987 1.00 2319.
## summarised on fractional scale
peake.brm4 |>
as_draws_df() |>
dplyr::select(starts_with("b_")) |>
mutate(across(everything(), exp)) |>
summarise_draws("median",
~ HDInterval::hdi(.x),
"rhat",
"ess_bulk")
Warning: Dropping 'draws_df' class as required metadata was removed.
# A tibble: 2 × 6
variable median lower upper rhat ess_bulk
<chr> <num> <num> <num> <num> <num>
1 b_Intercept 0.325 0.0329 1.02 1.00 2332.
2 b_logAREA 2.27 1.92 2.64 1.00 2319.
Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
recommended alternatives.
# A tibble: 2,400 × 8
b_Intercept b_logAREA shape prior_Intercept prior_b prior_shape lprior lp__
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 -0.537 0.765 5.12 7.69 0.641 1.32 -6.03 -159.
2 -0.223 0.709 4.60 4.94 -0.106 2.58 -5.59 -159.
3 -2.51 0.972 5.19 5.77 -1.27 1.95 -6.29 -161.
4 -1.31 0.844 5.67 5.65 0.0766 0.817 -6.55 -158.
5 -1.69 0.876 5.53 7.53 0.769 2.46 -6.47 -159.
6 -1.11 0.825 4.24 5.23 -1.84 1.98 -5.39 -159.
7 -0.353 0.723 3.69 6.78 -0.675 0.786 -4.91 -160.
8 0.0172 0.665 5.73 5.68 2.43 2.57 -6.49 -163.
9 -1.19 0.810 3.94 7.38 0.774 2.38 -5.18 -160.
10 -1.47 0.859 7.18 7.01 0.181 0.647 -7.84 -159.
# ℹ 2,390 more rows
11 Hypothesis testing
Unfortunately, the inline log-transformation of the predictor interferes with hypothesis()’s ability to find the slope. We can get around this by renaming before calling hypothesis().
Hypothesis Tests for class :
Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
1 (lAREA) > 0 0.83 0.09 0.68 0.97 Inf 1 *
---
'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 95%;
for two-sided hypotheses, the value tested against lies outside the 95%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
Conclusions:
- the probability of a positive effect of mussel clump area on number of individuals is 1. That is, we are (essentially) 100% confident that there is an effect.
- the evidence ratio is normally the ratio of the number of cases that satisfy the hypothesis to the number of cases that do not. Since the number of cases that do not satisfy the hypothesis is 0, the evidence ratio is Inf (division by 0).
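As a base-R illustration (with simulated draws, not the model's actual posterior), the posterior probability is the proportion of draws satisfying the hypothesis, and the evidence ratio is the corresponding odds:

```r
set.seed(1)
## made-up posterior draws for a slope that is clearly positive
slope_draws <- rnorm(1000, mean = 0.82, sd = 0.08)

p  <- mean(slope_draws > 0)  # posterior probability of a positive slope
er <- p / (1 - p)            # evidence ratio (odds in favour of the hypothesis)

p   # 1 here: every draw is positive
er  # Inf, since no draws fall below 0
```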
Alternatively…
Conclusions:
- the probability of a positive effect of mussel clump area on number of individuals is 1. That is, we are (essentially) 100% confident that there is an effect.
newdata = list(AREA=c(5000, 10000))
## fractional scale
peake.rstanarm4 |> emmeans(~AREA, at=newdata, type = 'response') |> pairs(reverse = TRUE)
Warning: Model has 0 prior weights, but we recovered 25 rows of data.
So prior weights were ignored.
contrast ratio lower.HPD upper.HPD
AREA10000 / AREA5000 1.77 1.58 1.99
Point estimate displayed: median
Results are back-transformed from the log scale
HPD interval probability: 0.95
## absolute scale
peake.rstanarm4 |> emmeans(~AREA, at=newdata) |> regrid() |> pairs(reverse = TRUE)
Warning: Model has 0 prior weights, but we recovered 25 rows of data.
So prior weights were ignored.
contrast estimate lower.HPD upper.HPD
AREA10000 - AREA5000 271 186 377
Point estimate displayed: median
HPD interval probability: 0.95
Warning: Model has 0 prior weights, but we recovered 25 rows of data.
So prior weights were ignored.
Conclusions:
- the change in expected number of individuals associated with an increase in mussel clump area from 5,000 to 10,000 is 271.
If we want to derive other properties, such as the percentage change, then we use tidy_draws() followed by simple tidyverse data-manipulation operations.
peake.mcmc <- peake.rstanarm4 |> emmeans(~AREA, at=newdata) |>
regrid() |>
tidy_draws() |>
rename_with(~str_replace(., 'AREA ', 'p')) |>
mutate(Eff=p10000 - p5000,
PEff=100*Eff/p5000)
Warning: Model has 0 prior weights, but we recovered 25 rows of data.
So prior weights were ignored.
# A tibble: 6 × 7
.chain .iteration .draw p5000 p10000 Eff PEff
<int> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 1 1 1 311. 543. 231. 74.4
2 1 2 2 358. 634. 276. 77.2
3 1 3 3 374. 650. 276. 73.9
4 1 4 4 373. 644. 271. 72.5
5 1 5 5 360. 672. 312. 86.7
6 1 6 6 376. 717. 341. 90.8
Now we can calculate the medians and HPD intervals of each column (and ignore the .chain, .iteration and .draw).
# A tibble: 7 × 5
term estimate std.error conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl>
1 .chain 2 0.817 1 3
2 .iteration 400. 231. 1 761
3 .draw 1200. 693. 1 2281
4 p5000 354. 35.7 288. 428.
5 p10000 626. 78.3 498. 800.
6 Eff 271. 50.3 186. 377.
7 PEff 77.3 10.7 57.9 99.2
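For context, an HPD (highest posterior density) interval like those above is the narrowest interval containing the nominated proportion of the draws. A rough base-R sketch of the empirical algorithm that HDInterval::hdi() implements (assuming a unimodal posterior):

```r
## narrowest interval containing `prob` of the draws
hdi_sketch <- function(draws, prob = 0.95) {
  x <- sort(draws)
  n <- length(x)
  k <- ceiling(prob * n)               # number of draws the interval must contain
  widths <- x[k:n] - x[1:(n - k + 1)]  # width of every candidate interval
  i <- which.min(widths)               # index of the narrowest one
  c(lower = x[i], upper = x[i + k - 1])
}

set.seed(1)
hdi_sketch(rnorm(10000))  # close to c(-1.96, 1.96) for a standard normal
```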
Alternatively, we could use median_hdci
# A tibble: 1 × 6
PEff .lower .upper .width .point .interval
<dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 77.3 57.9 99.2 0.95 median hdci
Conclusions:
- the percentage change in expected number of individuals associated with an increase in mussel clump area from 5,000 to 10,000 is 77.33% with a 95% credibility interval of 57.91 to 99.17.
To get the probability that the effect is greater than a 50% increase.
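This probability is simply the proportion of posterior draws of PEff that exceed 50. A minimal base-R sketch with a handful of made-up draws (the real computation uses the full PEff column of peake.mcmc):

```r
## proportion of draws in which the percentage change exceeds 50%
peff_draws <- c(74.4, 77.2, 73.9, 72.5, 86.7, 90.8, 48.1, 63.0)  # made-up draws
mean(peff_draws > 50)  # 7 of the 8 draws exceed 50%, so P = 0.875
```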
Conclusions:
- the probability that the number of individuals will increase by more than 50% following an increase in mussel clump area from 5,000 to 10,000 is 0.99.
Finally, we could alternatively use hypothesis(). Note that when we do so, the estimate is the difference between the effect and the hypothesised effect (50%).
Hypothesis Tests for class :
Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
1 (PEff)-(50) > 0 27.66 10.67 10.75 45.35 170.43 0.99
Star
1 *
---
'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 95%;
for two-sided hypotheses, the value tested against lies outside the 95%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
## Note, the P value is on absolute difference
newdata = list(AREA=c(5000, 10000))
peake.rstanarm4 |>
emmeans(~AREA, at=newdata) |>
regrid() |>
pairs(reverse = TRUE) |>
tidy_draws() |>
summarise(across(contains('contrast'),
list(P = ~ mean(.>50),
HDCI = ~ median_hdci(.)),
.names = c('{.fn}')
)) |>
tidyr::unpack(HDCI)
Warning: Model has 0 prior weights, but we recovered 25 rows of data.
So prior weights were ignored.
# A tibble: 1 × 7
P y ymin ymax .width .point .interval
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 1 271. 186. 377. 0.95 median hdci
We can also express this as a percentage change
newdata = list(AREA=c(5000, 10000))
## Simple
peake.rstanarm4 |>
emmeans(~AREA, at=newdata) |>
pairs(reverse = TRUE) |>
regrid()
Warning: Model has 0 prior weights, but we recovered 25 rows of data.
So prior weights were ignored.
contrast ratio lower.HPD upper.HPD
AREA10000/AREA5000 1.77 1.58 1.99
Point estimate displayed: median
HPD interval probability: 0.95
## More advanced (both P and percent change)
peake.mcmc <- peake.rstanarm4 |>
emmeans(~AREA, at=newdata) |>
pairs(reverse = TRUE) |>
regrid() |>
tidy_draws() |>
mutate(across(contains('contrast'), ~ 100*(. - 1)))
Warning: Model has 0 prior weights, but we recovered 25 rows of data.
So prior weights were ignored.
peake.mcmc |>
summarise(across(contains('contrast'),
list(P = ~ mean(.>50),
HDCI = ~ median_hdci(.)),
.names = c('{.fn}')
)) |>
tidyr::unpack(HDCI)
# A tibble: 1 × 7
P y ymin ymax .width .point .interval
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 0.994 77.3 57.9 99.2 0.95 median hdci
peake.rstanarm4 |>
linpred_draws(newdata = as.data.frame(newdata)) |>
mutate(.linpred = exp(.linpred)) |>
ungroup() |>
group_by(.draw) |>
summarise(Eff = diff(.linpred),
PEff = 100*Eff/.linpred[1]) |>
ungroup() |>
mutate(P = mean(PEff>50)) |>
pivot_longer(cols = -.draw) |>
group_by(name) |>
median_hdci()
# A tibble: 3 × 7
name value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 Eff 271. 186. 377. 0.95 median hdci
2 P 0.994 0.994 0.994 0.95 median hdci
3 PEff 77.3 57.9 99.2 0.95 median hdci
##OR
peake.rstanarm4 |>
epred_draws(newdata = as.data.frame(newdata)) |>
ungroup() |>
group_by(.draw) |>
summarise(Eff = diff(.epred),
PEff = 100*Eff/.epred[1]) |>
ungroup() |>
mutate(P = mean(PEff>50)) |>
pivot_longer(cols = -.draw) |>
group_by(name) |>
median_hdci()
# A tibble: 3 × 7
name value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 Eff 271. 186. 377. 0.95 median hdci
2 P 0.994 0.994 0.994 0.95 median hdci
3 PEff 77.3 57.9 99.2 0.95 median hdci
##OR for prediction of individual values
peake.rstanarm4 |>
predicted_draws(newdata = as.data.frame(newdata)) |>
ungroup() |>
group_by(.draw) |>
summarise(Eff = diff(.prediction),
PEff = 100*Eff/.prediction[1]) |>
ungroup() |>
mutate(P = mean(PEff>50)) |>
pivot_longer(cols = -.draw) |>
group_by(name) |>
median_hdci()
# A tibble: 3 × 7
name value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 Eff 239 -344 1065 0.95 median hdci
2 P 0.592 0.592 0.592 0.95 median hdci
3 PEff 75.4 -80.2 518. 0.95 median hdci
Hypothesis Tests for class b:
Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
1 (logAREA) > 0 0.82 0.08 0.69 0.95 Inf 1 *
---
'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 95%;
for two-sided hypotheses, the value tested against lies outside the 95%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
Conclusions:
- the probability of a positive effect of mussel clump area on number of individuals is 1. That is, we are (essentially) 100% confident that there is an effect.
- the evidence ratio is normally the ratio of the number of cases that satisfy the hypothesis to the number of cases that do not. Since the number of cases that do not satisfy the hypothesis is 0, the evidence ratio is Inf (division by 0).
Alternatively…
Conclusions:
- the probability of a positive effect of mussel clump area on number of individuals is 1. That is, we are (essentially) 100% confident that there is an effect.
newdata = list(AREA=c(5000, 10000))
## fractional scale
peake.brm4 |> emmeans(~AREA, at=newdata, type = 'response') |> pairs(reverse = TRUE)
 contrast ratio lower.HPD upper.HPD
AREA10000 / AREA5000 1.77 1.57 1.96
Point estimate displayed: median
Results are back-transformed from the log scale
HPD interval probability: 0.95
contrast estimate lower.HPD upper.HPD
AREA10000 - AREA5000 270 193 372
Point estimate displayed: median
HPD interval probability: 0.95
Conclusions:
- the change in expected number of individuals associated with an increase in mussel clump area from 5,000 to 10,000 is 270.
If we want to derive other properties, such as the percentage change, then we use tidy_draws() followed by simple tidyverse data-manipulation operations.
peake.mcmc <- peake.brm4 |> emmeans(~AREA, at=newdata) |>
regrid() |>
tidy_draws() |>
rename_with(~str_replace(., 'AREA ', 'p')) |>
mutate(Eff=p10000 - p5000,
PEff=100*Eff/p5000)
peake.mcmc |> head()
# A tibble: 6 × 7
.chain .iteration .draw p5000 p10000 Eff PEff
<int> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 1 1 1 395. 672. 277. 69.9
2 1 2 2 334. 547. 212. 63.4
3 1 3 3 321. 630. 309. 96.1
4 1 4 4 357. 640. 283. 79.5
5 1 5 5 321. 588. 268. 83.5
6 1 6 6 372. 658. 287. 77.1
Now we can calculate the medians and HPD intervals of each column (and ignore the .chain, .iteration and .draw).
# A tibble: 7 × 5
term estimate std.error conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl>
1 .chain 2 0.817 1 3
2 .iteration 400. 231. 1 761
3 .draw 1200. 693. 1 2281
4 p5000 353. 34.7 296. 426.
5 p10000 623. 75.0 494. 780.
6 Eff 270. 47.7 193. 372.
7 PEff 76.6 10.1 57.2 96.1
Alternatively, we could use median_hdci
# A tibble: 1 × 6
PEff .lower .upper .width .point .interval
<dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 76.6 57.2 96.1 0.95 median hdci
Conclusions:
- the percentage change in expected number of individuals associated with an increase in mussel clump area from 5,000 to 10,000 is 76.55% with a 95% credibility interval of 57.2 to 96.13.
To get the probability that the effect is greater than a 50% increase.
Conclusions:
- the probability that the number of individuals will increase by more than 50% following an increase in mussel clump area from 5,000 to 10,000 is 0.998.
Finally, we could alternatively use hypothesis(). Note that when we do so, the estimate is the difference between the effect and the hypothesised effect (50%).
Hypothesis Tests for class :
Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
1 (PEff)-(50) > 0 26.84 10.09 10.89 43.47 479 1
Star
1 *
---
'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 95%;
for two-sided hypotheses, the value tested against lies outside the 95%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
## Note, the P value is on absolute difference
newdata = list(AREA=c(5000, 10000))
peake.brm4 |>
emmeans(~AREA, at=newdata) |>
regrid() |>
pairs(reverse = TRUE) |>
tidy_draws() |>
summarise(across(contains('contrast'),
list(P = ~ mean(.>50),
HDCI = ~ median_hdci(.)),
.names = c('{.fn}')
)) |>
tidyr::unpack(HDCI)
# A tibble: 1 × 7
P y ymin ymax .width .point .interval
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 1 270. 193. 372. 0.95 median hdci
We can also express this as a percentage change
newdata = list(AREA=c(5000, 10000))
## Simple
peake.brm4 |>
emmeans(~AREA, at=newdata) |>
pairs(reverse = TRUE) |>
regrid()
 contrast ratio lower.HPD upper.HPD
AREA10000/AREA5000 1.77 1.57 1.96
Point estimate displayed: median
HPD interval probability: 0.95
## More advanced (both P and percent change)
peake.mcmc <- peake.brm4 |>
emmeans(~AREA, at=newdata) |>
pairs(reverse = TRUE) |>
regrid() |>
tidy_draws() |>
mutate(across(contains('contrast'), ~ 100*(. - 1)))
peake.mcmc |>
summarise(across(contains('contrast'),
list(P = ~ mean(.>50),
HDCI = ~ median_hdci(.)),
.names = c('{.fn}')
)) |>
tidyr::unpack(HDCI)
# A tibble: 1 × 7
P y ymin ymax .width .point .interval
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 0.998 76.6 57.2 96.1 0.95 median hdci
peake.brm4 |>
linpred_draws(newdata = as.data.frame(newdata)) |>
mutate(.linpred = exp(.linpred)) |>
ungroup() |>
group_by(.draw) |>
summarise(Eff = diff(.linpred),
PEff = 100*Eff/.linpred[1]) |>
ungroup() |>
mutate(P = mean(PEff>50)) |>
pivot_longer(cols = -.draw) |>
group_by(name) |>
median_hdci()
# A tibble: 3 × 7
name value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 Eff 270. 193. 372. 0.95 median hdci
2 P 0.998 0.998 0.998 0.95 median hdci
3 PEff 76.6 57.2 96.1 0.95 median hdci
##OR
peake.brm4 |>
epred_draws(newdata = as.data.frame(newdata)) |>
ungroup() |>
group_by(.draw) |>
summarise(Eff = diff(.epred),
PEff = 100*Eff/.epred[1]) |>
ungroup() |>
mutate(P = mean(PEff>50)) |>
pivot_longer(cols = -.draw) |>
group_by(name) |>
median_hdci()
# A tibble: 3 × 7
name value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 Eff 270. 193. 372. 0.95 median hdci
2 P 0.998 0.998 0.998 0.95 median hdci
3 PEff 76.6 57.2 96.1 0.95 median hdci
##OR for prediction of individual values
peake.brm4 |>
predicted_draws(newdata = as.data.frame(newdata)) |>
ungroup() |>
group_by(.draw) |>
summarise(Eff = diff(.prediction),
PEff = 100*Eff/.prediction[1]) |>
ungroup() |>
mutate(P = mean(PEff>50)) |>
pivot_longer(cols = -.draw) |>
group_by(name) |>
median_hdci()
# A tibble: 3 × 7
name value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 Eff 241 -412 965 0.95 median hdci
2 P 0.592 0.592 0.592 0.95 median hdci
3 PEff 75.0 -81.7 460. 0.95 median hdci
12 Summary figure
## Using emmeans
peake.grid = with(peake, list(AREA = seq(min(AREA), max(AREA), len=100)))
newdata = emmeans(peake.rstanarm4, ~AREA, at=peake.grid, type='response') |> as.data.frame()
Warning: Model has 0 prior weights, but we recovered 25 rows of data.
So prior weights were ignored.
AREA prob lower.HPD upper.HPD
462.2500 49.17637 31.01453 72.41448
731.7629 72.05937 48.03657 99.58929
1001.2759 93.52104 66.37835 125.65270
1270.7888 114.00802 85.57218 150.80201
1540.3017 133.58711 103.05017 173.39565
1809.8146 152.66106 119.22524 194.18054
Point estimate displayed: median
Results are back-transformed from the log scale
HPD interval probability: 0.95
ggplot(newdata, aes(y=prob, x=AREA)) +
geom_point(data=peake, aes(y=INDIV)) +
geom_line() +
geom_ribbon(aes(ymin=lower.HPD, ymax=upper.HPD), fill='blue', alpha=0.3) +
scale_y_continuous('Individuals') +
scale_x_continuous('Mussel clump area') +
theme_classic()
## spaghetti plot
newdata = emmeans(peake.rstanarm4, ~AREA, at=peake.grid) |>
regrid() |>
gather_emmeans_draws()
Warning: Model has 0 prior weights, but we recovered 25 rows of data.
So prior weights were ignored.
# A tibble: 6 × 5
# Groups: AREA [1]
AREA .chain .iteration .draw .value
<dbl> <int> <int> <int> <dbl>
1 462. NA NA 1 46.1
2 462. NA NA 2 50.2
3 462. NA NA 3 55.9
4 462. NA NA 4 57.4
5 462. NA NA 5 42.2
6 462. NA NA 6 40.8
ggplot(newdata, aes(y=.value, x=AREA)) +
geom_line(aes(group=.draw), colour = 'orange', alpha=0.01) +
geom_point(data=peake, aes(y=INDIV)) +
scale_y_continuous('Number of Individuals') +
scale_x_continuous(expression(Mussel~clump~area~(per~mm^2))) +
theme_classic()
## Using emmeans
peake.grid = with(peake, list(AREA = seq(min(AREA), max(AREA), len=100)))
newdata = emmeans(peake.brm4, ~AREA, at=peake.grid, type='response') |> as.data.frame()
head(newdata)
 AREA prob lower.HPD upper.HPD
462.2500 49.93476 32.37069 73.86866
731.7629 73.10684 50.37555 101.05399
1001.2759 94.55213 67.08377 125.23143
1270.7888 115.12309 83.09912 147.17406
1540.3017 134.78309 101.30133 170.42664
1809.8146 153.94110 119.07761 191.48688
Point estimate displayed: median
Results are back-transformed from the log scale
HPD interval probability: 0.95
ggplot(newdata, aes(y=prob, x=AREA)) +
geom_point(data=peake, aes(y=INDIV)) +
geom_line() +
geom_ribbon(aes(ymin=lower.HPD, ymax=upper.HPD), fill='blue', alpha=0.3) +
scale_y_continuous('Individuals') +
scale_x_continuous('Mussel clump area') +
theme_classic()
## spaghetti plot
newdata = emmeans(peake.brm4, ~AREA, at=peake.grid) |>
regrid() |>
gather_emmeans_draws()
newdata |> head()
# A tibble: 6 × 5
# Groups: AREA [1]
AREA .chain .iteration .draw .value
<dbl> <int> <int> <int> <dbl>
1 462. NA NA 1 63.9
2 462. NA NA 2 61.9
3 462. NA NA 3 31.8
4 462. NA NA 4 47.8
5 462. NA NA 5 39.8
6 462. NA NA 6 52.1
ggplot(newdata, aes(y=.value, x=AREA)) +
geom_line(aes(group=.draw), colour = 'orange', alpha=0.01) +
geom_point(data=peake, aes(y=INDIV)) +
scale_y_continuous('Number of Individuals') +
scale_x_continuous(expression(Mussel~clump~area~(per~mm^2))) +
theme_classic()